Overview

Dataset statistics

Number of variables: 15
Number of observations: 3498
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 232.4 KiB
Average record size in memory: 68.0 B
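
The average record size is the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 B; the function name is illustrative and not part of pandas-profiling:

```python
def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, given the total memory footprint in KiB."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    return total_kib * 1024 / n_rows


# Values taken from the dataset statistics in this report.
size = average_record_size_bytes(232.4, 3498)
print(f"{size:.1f} B")  # 68.0 B
```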

Variable types

Numeric: 11
Categorical: 4

Alerts

Model_year is highly correlated with Mileage (High correlation)
Kilometers is highly correlated with Model_year (High correlation)
Registration is highly correlated with df_index and 2 other fields (High correlation)
State is highly correlated with df_index and 2 other fields (High correlation)
Fuel_capacity is highly correlated with Company and 5 other fields (High correlation)
Price is highly correlated with Company and 2 other fields (High correlation)
df_index is highly correlated with Registration and 2 other fields (High correlation)
Company is highly correlated with Model_name and 4 other fields (High correlation)
Model_name is highly correlated with Company and 2 other fields (High correlation)
Fuel_Type is highly correlated with Mileage and 2 other fields (High correlation)
Mileage is highly correlated with Company and 5 other fields (High correlation)
Seating_capacity is highly correlated with Company and 2 other fields (High correlation)
City is highly correlated with df_index and 2 other fields (High correlation)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)
City has 216 (6.2%) zeros (Zeros)

Reproduction

Analysis started: 2022-12-05 10:19:38.141217
Analysis finished: 2022-12-05 10:20:18.554720
Duration: 40.41 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 3498
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1749.829617
Minimum: 0
Maximum: 3500
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 27.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 174.85
Q1: 875.25
median: 1749.5
Q3: 2624.75
95-th percentile: 3324.15
Maximum: 3500
Range: 3500
Interquartile range (IQR): 1749.5
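
The range and IQR are derived from the other quantile statistics: range = maximum − minimum and IQR = Q3 − Q1. A minimal sketch using only the standard library; the function name is illustrative, and `method="inclusive"` (linear interpolation between data points) is an assumption about how the report's quartiles were interpolated:

```python
import statistics


def quantile_summary(values):
    """Five-number summary plus the derived range and IQR."""
    data = sorted(values)
    # Assumption: linear interpolation between order statistics.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "Q1": q1,
        "median": median,
        "Q3": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "IQR": q3 - q1,
    }


# Toy data 0..8, not the report's dataset.
print(quantile_summary(range(9)))
```

For df_index the same relations hold on the reported values: 2624.75 − 875.25 = 1749.5 and 3500 − 0 = 3500.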

Descriptive statistics

Standard deviation: 1010.48178
Coefficient of variation (CV): 0.5774743837
Kurtosis: -1.199726532
Mean: 1749.829617
Median Absolute Deviation (MAD): 875
Skewness: 0.0002647233067
Sum: 6120904
Variance: 1021073.427
Monotonicity: Strictly increasing
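
Several of these statistics are related: the variance is the square of the standard deviation, the CV is the standard deviation divided by the mean, and the MAD is the median of the absolute deviations from the median. A minimal sketch with the standard library; the function name is illustrative, and the use of the sample (n − 1) standard deviation is an assumption about how the report computes it:

```python
import statistics


def descriptive_stats(values):
    """Mean, spread, and robust spread of a numeric sample."""
    data = list(values)
    mean = statistics.fmean(data)
    std = statistics.stdev(data)  # assumption: sample standard deviation (n - 1)
    median = statistics.median(data)
    mad = statistics.median([abs(v - median) for v in data])
    return {
        "mean": mean,
        "std": std,
        "variance": statistics.variance(data),
        "cv": std / mean,  # undefined when the mean is zero
        "mad": mad,
        "sum": sum(data),
    }


# Toy data, not the report's dataset.
stats = descriptive_stats([2, 4, 4, 4, 5, 5, 7, 9])
print(stats["mean"], stats["mad"])  # 5.0 0.5
```

The reported df_index values are consistent with the CV relation: 1010.48178 / 1749.829617 ≈ 0.57747.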
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    1      < 0.1%
2337                 1      < 0.1%
2326                 1      < 0.1%
2327                 1      < 0.1%
2328                 1      < 0.1%
2329                 1      < 0.1%
2330                 1      < 0.1%
2331                 1      < 0.1%
2332                 1      < 0.1%
2333                 1      < 0.1%
Other values (3488)  3488   99.7%

Minimum 10 values

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
3500   1      < 0.1%
3499   1      < 0.1%
3498   1      < 0.1%
3497   1      < 0.1%
3496   1      < 0.1%
3495   1      < 0.1%
3494   1      < 0.1%
3493   1      < 0.1%
3492   1      < 0.1%
3491   1      < 0.1%
< 0.1%

Company
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.848198971
Minimum: 0
Maximum: 16
Zeros: 2
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 4
median: 8
Q3: 8
95-th percentile: 14
Maximum: 16
Range: 16
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.40802572
Coefficient of variation (CV): 0.4976528477
Kurtosis: 0.1610136514
Mean: 6.848198971
Median Absolute Deviation (MAD): 4
Skewness: 0.7479409017
Sum: 23955
Variance: 11.61463931
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Common values

Value             Count  Frequency (%)
8                 1458   41.7%
4                 961    27.5%
3                 322    9.2%
12                152    4.3%
14                140    4.0%
2                 117    3.3%
16                69     2.0%
15                58     1.7%
7                 52     1.5%
6                 39     1.1%
Other values (7)  130    3.7%

Minimum 10 values

Value  Count  Frequency (%)
0      2      0.1%
1      32     0.9%
2      117    3.3%
3      322    9.2%
4      961    27.5%
5      33     0.9%
6      39     1.1%
7      52     1.5%
8      1458   41.7%
9      1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
16     69     2.0%
15     58     1.7%
14     140    4.0%
13     19     0.5%
12     152    4.3%
11     8      0.2%
10     35     1.0%
9      1      < 0.1%
8      1458   41.7%
7      52     1.5%

Model_name
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 618
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 283.2798742
Minimum: 0
Maximum: 617
Zeros: 3
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 14
Q1: 110
median: 279
Q3: 458
95-th percentile: 574
Maximum: 617
Range: 617
Interquartile range (IQR): 348

Descriptive statistics

Standard deviation: 186.8394191
Coefficient of variation (CV): 0.6595576889
Kurtosis: -1.26594318
Mean: 283.2798742
Median Absolute Deviation (MAD): 174
Skewness: 0.09500963119
Sum: 990913
Variance: 34908.96855
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
553                 120    3.4%
458                 101    2.9%
46                  98     2.8%
9                   92     2.6%
283                 85     2.4%
17                  68     1.9%
172                 47     1.3%
108                 39     1.1%
51                  38     1.1%
279                 37     1.1%
Other values (608)  2773   79.3%

Minimum 10 values

Value  Count  Frequency (%)
0      3      0.1%
1      1      < 0.1%
2      1      < 0.1%
3      3      0.1%
4      21     0.6%
5      6      0.2%
6      2      0.1%
7      2      0.1%
8      2      0.1%
9      92     2.6%

Maximum 10 values

Value  Count  Frequency (%)
617    1      < 0.1%
616    8      0.2%
615    2      0.1%
614    1      < 0.1%
613    16     0.5%
612    2      0.1%
611    1      < 0.1%
610    7      0.2%
609    8      0.2%
608    2      0.1%

Model_year
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.276158
Minimum: 2008
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 2008
5-th percentile: 2012
Q1: 2016
median: 2018
Q3: 2019
95-th percentile: 2021
Maximum: 2022
Range: 14
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.53616854
Coefficient of variation (CV): 0.001257224267
Kurtosis: 0.1523105267
Mean: 2017.276158
Median Absolute Deviation (MAD): 2
Skewness: -0.7057338161
Sum: 7056432
Variance: 6.432150861
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common values

Value             Count  Frequency (%)
2018              682    19.5%
2019              538    15.4%
2017              499    14.3%
2020              386    11.0%
2016              334    9.5%
2021              260    7.4%
2014              244    7.0%
2015              234    6.7%
2013              112    3.2%
2012              93     2.7%
Other values (5)  116    3.3%

Minimum 10 values

Value  Count  Frequency (%)
2008   1      < 0.1%
2009   4      0.1%
2010   45     1.3%
2011   47     1.3%
2012   93     2.7%
2013   112    3.2%
2014   244    7.0%
2015   234    6.7%
2016   334    9.5%
2017   499    14.3%

Maximum 10 values

Value  Count  Frequency (%)
2022   19     0.5%
2021   260    7.4%
2020   386    11.0%
2019   538    15.4%
2018   682    19.5%
2017   499    14.3%
2016   334    9.5%
2015   234    6.7%
2014   244    7.0%
2013   112    3.2%

Kilometers
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3422
Distinct (%): 97.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42222.7773
Minimum: 269
Maximum: 455601
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 269
5-th percentile: 8165.65
Q1: 22052.5
median: 39245.5
Q3: 59328.75
95-th percentile: 87893.25
Maximum: 455601
Range: 455332
Interquartile range (IQR): 37276.25

Descriptive statistics

Standard deviation: 25492.86696
Coefficient of variation (CV): 0.6037704904
Kurtosis: 19.77983252
Mean: 42222.7773
Median Absolute Deviation (MAD): 18320.5
Skewness: 1.717933625
Sum: 147695275
Variance: 649886265.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
20314                3      0.1%
42796                2      0.1%
31511                2      0.1%
11159                2      0.1%
28809                2      0.1%
7632                 2      0.1%
63497                2      0.1%
19318                2      0.1%
57801                2      0.1%
28198                2      0.1%
Other values (3412)  3477   99.4%

Minimum 10 values

Value  Count  Frequency (%)
269    1      < 0.1%
410    1      < 0.1%
1076   1      < 0.1%
1087   1      < 0.1%
1122   1      < 0.1%
1298   1      < 0.1%
1345   1      < 0.1%
1444   1      < 0.1%
1568   1      < 0.1%
1650   1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
455601  1      < 0.1%
242614  1      < 0.1%
110457  1      < 0.1%
101526  1      < 0.1%
100170  1      < 0.1%
100162  1      < 0.1%
100003  1      < 0.1%
99957   1      < 0.1%
99890   1      < 0.1%
99854   1      < 0.1%

Owner
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 198.3 KiB

Value  Count
0      2619
1      823
2      56

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3498
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      2619   74.9%
1      823    23.5%
2      56     1.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      2619   74.9%
1      823    23.5%
2      56     1.6%

Most occurring characters

Value  Count  Frequency (%)
0      2619   74.9%
1      823    23.5%
2      56     1.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3498   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2619   74.9%
1      823    23.5%
2      56     1.6%

Most occurring scripts

Value   Count  Frequency (%)
Common  3498   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      2619   74.9%
1      823    23.5%
2      56     1.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3498   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      2619   74.9%
1      823    23.5%
2      56     1.6%

Transmission
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 198.3 KiB

Value  Count
1      2861
0      637

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3498
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      2861   81.8%
0      637    18.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1      2861   81.8%
0      637    18.2%

Most occurring characters

Value  Count  Frequency (%)
1      2861   81.8%
0      637    18.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3498   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      2861   81.8%
0      637    18.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  3498   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      2861   81.8%
0      637    18.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3498   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      2861   81.8%
0      637    18.2%

Registration
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 318
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 147.0731847
Minimum: 0
Maximum: 317
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 20
Q1: 77
median: 158
Q3: 222
95-th percentile: 301
Maximum: 317
Range: 317
Interquartile range (IQR): 145

Descriptive statistics

Standard deviation: 90.45561488
Coefficient of variation (CV): 0.6150381191
Kurtosis: -1.139576989
Mean: 147.0731847
Median Absolute Deviation (MAD): 72
Skewness: 0.1905185699
Sum: 514462
Variance: 8182.218263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
86                  145    4.1%
63                  112    3.2%
285                 108    3.1%
166                 106    3.0%
119                 97     2.8%
88                  94     2.7%
160                 85     2.4%
87                  84     2.4%
28                  81     2.3%
158                 80     2.3%
Other values (308)  2506   71.6%

Minimum 10 values

Value  Count  Frequency (%)
0      1      < 0.1%
1      2      0.1%
2      6      0.2%
3      2      0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      4      0.1%
8      6      0.2%
9      2      0.1%

Maximum 10 values

Value  Count  Frequency (%)
317    1      < 0.1%
316    2      0.1%
315    2      0.1%
314    1      < 0.1%
313    1      < 0.1%
312    11     0.3%
311    19     0.5%
310    1      < 0.1%
309    13     0.4%
308    6      0.2%

State
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.267295597
Minimum: 0
Maximum: 14
Zeros: 33
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 6
Q3: 10
95-th percentile: 14
Maximum: 14
Range: 14
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.057602072
Coefficient of variation (CV): 0.647424716
Kurtosis: -0.9374076627
Mean: 6.267295597
Median Absolute Deviation (MAD): 3
Skewness: 0.5048307918
Sum: 21923
Variance: 16.46413458
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common values

Value             Count  Frequency (%)
4                 732    20.9%
6                 615    17.6%
1                 413    11.8%
10                374    10.7%
13                256    7.3%
3                 234    6.7%
2                 211    6.0%
14                194    5.5%
12                157    4.5%
9                 125    3.6%
Other values (5)  187    5.3%

Minimum 10 values

Value  Count  Frequency (%)
0      33     0.9%
1      413    11.8%
2      211    6.0%
3      234    6.7%
4      732    20.9%
5      54     1.5%
6      615    17.6%
7      98     2.8%
8      1      < 0.1%
9      125    3.6%

Maximum 10 values

Value  Count  Frequency (%)
14     194    5.5%
13     256    7.3%
12     157    4.5%
11     1      < 0.1%
10     374    10.7%
9      125    3.6%
8      1      < 0.1%
7      98     2.8%
6      615    17.6%
5      54     1.5%

Fuel_Type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 198.3 KiB

Value  Count
1      2923
0      447
2      128

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3498
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      2923   83.6%
0      447    12.8%
2      128    3.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1      2923   83.6%
0      447    12.8%
2      128    3.7%

Most occurring characters

Value  Count  Frequency (%)
1      2923   83.6%
0      447    12.8%
2      128    3.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3498   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      2923   83.6%
0      447    12.8%
2      128    3.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  3498   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      2923   83.6%
0      447    12.8%
2      128    3.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3498   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      2923   83.6%
0      447    12.8%
2      128    3.7%

Mileage
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 138
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.24419668
Minimum: 10.3
Maximum: 35.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 27.5 KiB

Quantile statistics

Minimum: 10.3
5-th percentile: 15.1
Q1: 18.2
median: 20.2
Q3: 22
95-th percentile: 25
Maximum: 35.6
Range: 25.3
Interquartile range (IQR): 3.8

Descriptive statistics

Standard deviation: 3.261909491
Coefficient of variation (CV): 0.1611281269
Kurtosis: 1.805088027
Mean: 20.24419668
Median Absolute Deviation (MAD): 1.9
Skewness: 0.6259047211
Sum: 70814.2
Variance: 10.64005353
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
21.4                167    4.8%
19.8                154    4.4%
18.6                142    4.1%
20.5                141    4.0%
18.9                114    3.3%
22                  100    2.9%
24.7                100    2.9%
21.2                84     2.4%
18                  81     2.3%
23                  77     2.2%
Other values (128)  2338   66.8%

Minimum 10 values

Value  Count  Frequency (%)
10.3   1      < 0.1%
10.4   1      < 0.1%
10.8   2      0.1%
10.9   1      < 0.1%
11     1      < 0.1%
11.6   3      0.1%
11.9   2      0.1%
12     10     0.3%
12.1   1      < 0.1%
12.4   3      0.1%

Maximum 10 values

Value  Count  Frequency (%)
35.6   4      0.1%
33.5   7      0.2%
33.4   1      < 0.1%
32.3   4      0.1%
31.8   12     0.3%
31.5   9      0.3%
31.2   10     0.3%
30.5   5      0.1%
28.4   18     0.5%
28.1   6      0.2%

Fuel_capacity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 24
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.13750715
Minimum: 27
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 27
5-th percentile: 32
Q1: 35
median: 40
Q3: 45
95-th percentile: 60
Maximum: 80
Range: 53
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.212102131
Coefficient of variation (CV): 0.1996256628
Kurtosis: 1.276642065
Mean: 41.13750715
Median Absolute Deviation (MAD): 5
Skewness: 1.084564781
Sum: 143899
Variance: 67.43862141
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)

Common values

Value              Count  Frequency (%)
35                 891    25.5%
37                 471    13.5%
43                 352    10.1%
45                 315    9.0%
40                 303    8.7%
42                 211    6.0%
60                 202    5.8%
28                 120    3.4%
32                 109    3.1%
55                 96     2.7%
Other values (14)  428    12.2%

Minimum 10 values

Value  Count  Frequency (%)
27     50     1.4%
28     120    3.4%
32     109    3.1%
35     891    25.5%
37     471    13.5%
40     303    8.7%
41     10     0.3%
42     211    6.0%
43     352    10.1%
44     37     1.1%

Maximum 10 values

Value  Count  Frequency (%)
80     2      0.1%
71     1      < 0.1%
70     30     0.9%
66     1      < 0.1%
65     5      0.1%
62     3      0.1%
60     202    5.8%
58     2      0.1%
55     96     2.7%
52     94     2.7%

Seating_capacity
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 198.3 KiB

Value  Count
5      3354
7      115
4      22
6      6
8      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3498
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 5
2nd row: 5
3rd row: 5
4th row: 5
5th row: 5

Common Values

Value  Count  Frequency (%)
5      3354   95.9%
7      115    3.3%
4      22     0.6%
6      6      0.2%
8      1      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
5      3354   95.9%
7      115    3.3%
4      22     0.6%
6      6      0.2%
8      1      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
5      3354   95.9%
7      115    3.3%
4      22     0.6%
6      6      0.2%
8      1      < 0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3498   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
5      3354   95.9%
7      115    3.3%
4      22     0.6%
6      6      0.2%
8      1      < 0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  3498   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
5      3354   95.9%
7      115    3.3%
4      22     0.6%
6      6      0.2%
8      1      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3498   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
5      3354   95.9%
7      115    3.3%
4      22     0.6%
6      6      0.2%
8      1      < 0.1%

City
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 12
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.341909663
Minimum: 0
Maximum: 11
Zeros: 216
Zeros (%): 6.2%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 8
95-th percentile: 11
Maximum: 11
Range: 11
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.544926982
Coefficient of variation (CV): 0.8164442047
Kurtosis: -0.9970423984
Mean: 4.341909663
Median Absolute Deviation (MAD): 2
Skewness: 0.6864048815
Sum: 15188
Variance: 12.5665073
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common values

Value             Count  Frequency (%)
3                 777    22.2%
1                 733    21.0%
10                398    11.4%
2                 373    10.7%
0                 216    6.2%
11                214    6.1%
8                 196    5.6%
4                 190    5.4%
6                 123    3.5%
9                 120    3.4%
Other values (2)  158    4.5%

Minimum 10 values

Value  Count  Frequency (%)
0      216    6.2%
1      733    21.0%
2      373    10.7%
3      777    22.2%
4      190    5.4%
5      104    3.0%
6      123    3.5%
7      54     1.5%
8      196    5.6%
9      120    3.4%

Maximum 10 values

Value  Count  Frequency (%)
11     214    6.1%
10     398    11.4%
9      120    3.4%
8      196    5.6%
7      54     1.5%
6      123    3.5%
5      104    3.0%
4      190    5.4%
3      777    22.2%
2      373    10.7%

Price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2883
Distinct (%): 82.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 637968.8928
Minimum: 135099
Maximum: 2790699
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 135099
5-th percentile: 291369
Q1: 427449
median: 560349
Q3: 746711.5
95-th percentile: 1276104
Maximum: 2790699
Range: 2655600
Interquartile range (IQR): 319262.5

Descriptive statistics

Standard deviation: 312140.1661
Coefficient of variation (CV): 0.4892717649
Kurtosis: 3.242943785
Mean: 637968.8928
Median Absolute Deviation (MAD): 150900
Skewness: 1.597687542
Sum: 2231615187
Variance: 9.743148332 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
687199               6      0.2%
342599               6      0.2%
651799               5      0.1%
603599               5      0.1%
704199               5      0.1%
877699               5      0.1%
417399               5      0.1%
420299               4      0.1%
522899               4      0.1%
453799               4      0.1%
Other values (2873)  3449   98.6%

Minimum 10 values

Value   Count  Frequency (%)
135099  1      < 0.1%
141399  1      < 0.1%
152999  1      < 0.1%
166299  1      < 0.1%
167499  1      < 0.1%
170699  1      < 0.1%
171499  1      < 0.1%
174299  1      < 0.1%
181299  1      < 0.1%
185799  1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2790699  1      < 0.1%
2028299  1      < 0.1%
2012499  1      < 0.1%
2002499  1      < 0.1%
1997699  1      < 0.1%
1941699  1      < 0.1%
1930299  1      < 0.1%
1922799  1      < 0.1%
1921499  1      < 0.1%
1915899  1      < 0.1%

Interactions

[121 interaction plots omitted: one pairwise plot for each combination of the 11 numeric variables.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
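
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the helper names (average_ranks, spearman_rho) are hypothetical, and giving tied values the average of their rank positions is an assumption.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions (assumed tie rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    # The 1/n factors of the covariance and standard deviations cancel, so sums suffice.
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# A nonlinear but strictly increasing relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because only the ranks enter the formula, the cubic relationship in the example is treated exactly like a straight line.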

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
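
As a rough illustration of this formula and of the invariance property mentioned above, the following sketch computes r directly from its definition. The function name pearson_r and the sample values are made up for the example; population (divide-by-n) moments are assumed, which does not change r because the factors cancel.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical sample data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shifting and rescaling each variable separately (with positive factors) leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```

Rescaling with a negative factor would flip the sign of r while keeping its magnitude.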

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
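
The pair-counting definition can be written out as a short sketch. It implements the simple form given above (often called tau-a), with tied pairs counted as neither concordant nor discordant; the function name kendall_tau_a is hypothetical, and library implementations commonly apply a tie-corrected variant instead, so results on data with ties may differ.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

The double loop over all pairs costs O(n²), which is fine for illustration but slow for large columns.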

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
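
For orientation, a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013) can be sketched as follows. This is an assumption about the general shape of the correction, not a copy of the report's implementation; the function name cramers_v_corrected and the toy labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, clipped at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfectly associated labels versus independent labels.
v_perfect = cramers_v_corrected(["x", "y"] * 50, ["p", "q"] * 50)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 25, ["p", "q", "p", "q"] * 25)
```

The uncorrected estimator would be sqrt(phi2 / min(r - 1, k - 1)); the correction mainly matters for small samples or tables with many categories.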

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

df_index Company Model_name Model_year Kilometers Owner Transmission Registration State Fuel_Type Mileage Fuel_capacity Seating_capacity City Price
0081220169666011666124.735511323999
118552201766693111666226.635511482399
228551201440532111586120.535511367599
333111201560086011676117.840511701799
44893201629544011666120.743511682099
557340201949956011576017.3457111153499
6614362201849765001706117.044511801899
7714364201880038001666017.944511803299
883110201846497001676118.040511957799
998172201810340011666121.237511699499

Last rows

df_index Company Model_name Model_year Kilometers Owner Transmission Registration State Fuel_Type Mileage Fuel_capacity Seating_capacity City Price
3488349112337201855408011919123.02856337699
34893492358201277735011979118.93556276499
3490349312336201912427012119125.02856429099
3491349412330201845942111929123.02856320999
34923495851202138321012119121.43756699499
349334964226201638453011979121.13256296599
3494349716383201752422111979116.54556554099
349534984226201640136011979121.13256284499
3496349982362017242614011929024.54576706199
3497350012336202042570111949125.02856376299